SUPPLEMENTARY FIGURES 

Supplementary Figs. 1-4 Damage patterns in single-stranded (SSL) and 
double-stranded (DSL) libraries 

Damage and fragmentation plots were generated for each sample and type of 
experiment using mapDamage (Ginolhac et al. 2011; Jonsson et al. 2013). As 
expected, we observed different patterns between SSL and DSL experiments. 
Specifically, the DSL showed C to T misincorporations at the 5' end and 
complementary G to A mismatches at the 3' end of the DNA molecules (Briggs et 
al. 2007), whereas the SSL showed C to T mismatches at both ends of the 
molecules (Meyer et al. 2012; Gansauge and Meyer 2013). The level of damage 
varied across samples, and samples with higher damage also tended to have 
lower endogenous content. 






Fig. S1 
Fig. S2 
Fig. S3 
Fig. S4 
Fig. S5 

Read length distribution of pre- and post-capture DSL libraries for the six 
samples that were captured with both capture methods. 
Fig. S6 

Read length distribution of pre-capture DSL, pre-capture SSL and post-capture 
SSL libraries. The relative gain of short fragments in the pre-capture SSL 
library compared to the pre-capture DSL library is lost when the SSL library 
is subjected to WGC. 
Fig. S7 



Lower endogenous content in the pre-capture library results in higher clonality 
in the post-capture libraries, and this relationship can be modeled with a 
logarithmic function. WISC experiments are shown in blue and MYbaits in green. 
Each dot represents a sample, and the number within it gives the total number 
of cycles used before and after WGC. Lines represent the fitted logarithmic 
functions, which are shown in the plot. 
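
The logarithmic relationship described in this caption could be fitted along the following lines. This is only a minimal sketch, assuming a model of the form clonality = a * ln(endogenous content) + b fitted by ordinary least squares; the function names and the example values are hypothetical and are not taken from the study, and the fitting procedure actually used for the figure is not specified here.

```python
import math


def fit_log_model(endogenous, clonality):
    """Least-squares fit of clonality = a * ln(endogenous) + b.

    Assumed model form; 'endogenous' values must be strictly positive.
    Returns the pair (a, b).
    """
    if len(endogenous) != len(clonality) or len(endogenous) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    xs = [math.log(e) for e in endogenous]  # log-transform the predictor
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(clonality) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("endogenous values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, clonality))
    a = sxy / sxx            # slope on the log scale
    b = mean_y - a * mean_x  # intercept
    return a, b


def predict_clonality(endogenous, a, b):
    """Evaluate the fitted curve at one endogenous-content value."""
    return a * math.log(endogenous) + b


# Hypothetical example values (NOT data from the study):
# pre-capture endogenous content (%) and post-capture clonality (%).
example_endogenous = [0.05, 0.2, 1.0, 5.0, 20.0]
example_clonality = [62.0, 48.0, 35.0, 21.0, 9.0]

a, b = fit_log_model(example_endogenous, example_clonality)
print(f"clonality = {a:.2f} * ln(endogenous) + {b:.2f}")
```

A negative fitted slope a would correspond to the pattern described in the caption, with clonality rising as endogenous content falls. Fitting WISC and MYbaits separately would simply mean calling fit_log_model once per method.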
SUPPLEMENTARY TABLES 
Supplementary Table 1 

Contamination estimates based on mitochondrial DNA show no evidence of 
increased contamination in WGC experiments, despite the observed bias toward 
longer reads in these experiments. Numbers show the point estimate of % 
contamination, and numbers within parentheses represent 95% confidence 
intervals (CI). Estimates and CI were calculated as described in Materials 
and Methods. 







      DSL                                                      SSL
      Shotgun            WISC               MYbaits            Shotgun            WISC               MYbaits
STM1  1.59 (0.24-4.8)    0.44 (0.15-0.97)   0.26 (0.03-0.82)   0.99 (0.16-18.7)   0.64 (0.11-1.43)   1.29 (0.67-2.18)
STM2  6.27 (2.97-12.23)  0.40 (0.07-0.99)   0.20 (0.02-0.72)   3.67 (0.59-63.82)  0.10 (0.02-1.17)   0.02 (0-0.44)
STM3  1.40 (0.3-3.84)    0.06 (0.01-0.34)   0.08 (0.01-0.37)   0.53 (0.09-11.1)   0.30 (0.07-0.71)   0.47 (0.15-0.47)






Supplementary Table 2 

Average GC content is shown per sample, type of library (SSL: single-stranded 
library) and experiment. All rows except the bottom three belong to DSL. 
Supplementary Table 3 



Reads within repeated regions of the human genome are, on average, longer 
than reads outside repeats. Average read length in bp is shown for each 
sample, type of library and experiment. All rows except the bottom three 
belong to double-stranded libraries; the last three rows belong to 
single-stranded libraries (SSL). The preferential retrieval of longer 
fragments by WGC methods might explain why captured libraries exhibit a 
higher fraction of reads within repeats than pre-capture libraries. 